FOREWORD


This is the text of an invited address, "The reliability of
measured values -- fundamental concepts," presented by
Dr. Churchill Eisenhart, Chief of the Statistical Engineering
Laboratory (Section 3 of Division 11, Applied Mathematics) of
the National Bureau of Standards, in the Symposium on Precision,
Accuracy, and Statistical Method, sponsored by the American
Society of Photogrammetry as part of the program of its 18th
Annual Meeting, held in Washington, D. C., on 9-11 January 1952.

Included here also (pp. 33-43) are comments on Dr. Eisenhart's
address by Mr. Amrom H. Katz (Chief Physicist, Photographic
Laboratory, Air Materiel Command, Wright-Patterson Air Force
Base, Dayton, Ohio), who organized and served as chairman of
the Symposium, and by Captain Oliver S. Reading (Chief, Division
of Photogrammetry, U. S. Coast and Geodetic Survey, Washington
25, D. C.), together with Dr. Eisenhart's replies to the
questions raised.

This material will be published in due course in the American
Journal of Photogrammetry as a part of the Proceedings of the
Symposium.

J. H. Curtiss
Chief, National Applied Mathematics Laboratories

A. V. Astin
Acting Director
National Bureau of Standards
Today I plan to discuss some fundamental concepts that have to
do with measurement. You will notice that my remarks are highly
statistical because I discovered some years ago that the theory
of measurement is intimately tied up with statistical concepts
and methods. It was as an undergraduate major in mathematical
physics at Princeton University that I first began to give
serious attention to the theory of measurement, and especially
to that part of the subject known as the "theory of errors." I
had done a little reading on the theory of errors in connection
with my undergraduate courses in physics, had found the subject
rather dull, and very likely would have given it no further
consideration had it not been for the influence of Dr. Edward U.
Condon, then (1930-1937) associate professor of physics at
Princeton, more recently Director of the National Bureau of
Standards (1945-1951), and now Director of Research and
Development at the Corning Glass Works.

Dr. Condon, learning from one of my other professors that I had
shown some interest in the theory of probability, suggested that
I take a second look at the theory of errors after familiarizing
myself with some of the more recent developments in statistical
theory and methodology. To punctuate his suggestion he loaned me
his personal copy of a then little known book by R. A. Fisher,
Statistical Methods for Research Workers (Edinburgh: Oliver &
Boyd, 1st ed., 1925; 11th ed., 1950). After perusing this volume
for quite some time I returned to Dr. Condon with (a) a
confession to the effect that I had been unable to determine the
mathematical basis of much that I had read in this book, and
(b) a conviction that, assuming this book to be sound, it
carried a great message to experimental physicists and chemists
who conduct and interpret experiments involving only a small
number of observations and that a "translation" of its message
should be made available to physical scientists without delay.
Dr. Condon replied that, with regard to the difficulty
experienced in fathoming the mathematical basis of the book, I
was not alone; that he believed the book to be sound; and "How
about my undertaking the 'translation'?" That was the beginning.
As my first effort, I wrote, in my junior year at Princeton, an
essay entitled "A Discussion of 'Student's' Method for Testing
the Significance of a Small Number of Observations." (For this
essay Mr. Eisenhart was awarded the William Marshall Bullitt
Prize in Mathematics, by Princeton University, in June 1933.
-- EDITOR.) I have been working on various facets of the
"translation" ever since, gaining, in the process, I believe,
somewhat greater insight into the theory of measurement as a
whole.

Before delving into fundamental concepts and principles of the
theory of measurement as I see it, I wish to tell a story that
has at least two messages for us here today. The story has to do
with Coca-Cola vending machines on a particular Pacific island
serving as a military base. These machines, unless empty, would
automatically emit one bottle of Coca-Cola for each nickel
inserted in the slot provided. But the costs had risen and
bottles of Coca-Cola, according to the story, were now to be
retailed at 6 cents a bottle. Unfortunately, the machine would
accept only nickels. Experience revealed that the GI's would not
put a penny in every instance into a little box that was placed
there for the purpose. The military police did not have enough
staff to have somebody posted there to see that the GI's did put
a penny in the box. So, an operations analyst on the post was
called in for advice. He had a statistical flavor in his
background. He said that they should simply fill the machine up,
but on the average put an empty bottle in every sixth place. In
this way empty bottles will be arranged in the machine at random
-- the perfect strategy. If the empties were arranged in a
systematic way, clever GI's would wait until somebody had bought
an empty bottle and then they would charge the machine, and
their enthusiasm added to his fury might be too much for the
machine. By putting empties in at random, nobody could outguess
the machine and all would be on an equal footing. The operations
analyst certainly solved the seller's problem, in the sense that
the seller was going to get his money -- 6 cents per full bottle
on the average -- but he didn't have much of a heart for the
purchasers on the other end of the deal. You may say that
everything was all right, because everyone who used the machine
would be fairly treated in the long run -- provided he bought
enough Coca-Colas -- and by the strong law of large numbers the
heavier the drinker, the more nearly certain that he would be
getting cokes for 6 cents per drink. But the fellow on the
island for just one day, with only one nickel to spend, he is
either going to be lucky or unlucky. What about him? Is he
treated fairly?
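
The arithmetic of the story can be checked with a short
simulation. The sketch below is only an illustration under
stated assumptions: each slot is taken to be empty independently
with probability 1/6 (one reading of "at random"), and the
purchase counts and the seed are arbitrary choices, not part of
the story.

```python
import random

NICKEL_CENTS = 5
P_EMPTY = 1 / 6  # assumption: each slot is empty independently with probability 1/6


def cents_per_full_bottle(n_purchases, rng):
    """Average cost in cents per full bottle over n_purchases nickels.

    Returns None if no full bottle was obtained at all.
    """
    full = sum(1 for _ in range(n_purchases) if rng.random() >= P_EMPTY)
    if full == 0:
        return None
    return NICKEL_CENTS * n_purchases / full


rng = random.Random(1952)  # fixed seed so the run is repeatable

# The heavy drinker: the long-run cost settles near 5 / (5/6) = 6 cents.
heavy = cents_per_full_bottle(200_000, rng)

# The one-day visitor with a single nickel: he either pays 5 cents for a
# full bottle or gets nothing; he never pays the "fair" 6 cents.
visitors = [cents_per_full_bottle(1, rng) for _ in range(10_000)]
unlucky_share = sum(v is None for v in visitors) / len(visitors)

print(f"heavy drinker: {heavy:.3f} cents per full bottle")
print(f"share of one-nickel visitors left with an empty: {unlucky_share:.3f}")
```

The long-run figure is an average over many purchases; for the
single purchase the only possible outcomes are a full bottle for
5 cents or an empty one.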

This story has two messages for us. First, correctness on the
average does not guarantee satisfactory outcomes in individual
cases. Thus, a single observation, or the average of only two or
three, should not be used as if it were the average of a great
many, without careful justification. Second, a consultant should
consider his client's problem in its entirety, and help reach a
full solution that is "best" from his client's viewpoint -- he
should not foist upon his client a clever solution to only a
part of the problem, leaving the client to take the rap
unprotected when the "worst" eventuality actually happens. Some
treatments of the theory of errors fall, I fear, in this latter
category.

Now, with these casual introductory remarks off my chest, I must
buckle down to the serious business of my assigned topic.

What is measurement? Briefly stated, measurement is a process
consisting of a sequence of steps or operations that yield as an
end result a number that serves to represent the amount, the
degree, extent, magnitude, or quantity of some property of a
thing -- a number that provides an answer to the question "how
much?" for someone to use for a specific purpose. The purpose
for which the answer is needed determines the method of
measurement employed, that is, the sequence of operations by
which the number is to be obtained, and also the precision and
accuracy that are requisite.

When a magnitude is determined by the use of instruments whose
indications yield directly the numerical value of the magnitude,
the process is called direct measurement; and the result
obtained, a direct measurement. Examples are: measurement of a
length by a scale, of mass by a balance, of electrical
resistance by a Wheatstone bridge, and of period of time by a
clock. On the other hand, when the magnitude is obtained by
calculation from the magnitude(s) of one or more
directly-measured quantities that bear a known relationship to
the quantity under investigation, the process is called indirect
measurement; and the result obtained, an indirect or derived
measurement. If, for example, the volume of a spherical ball is
computed from a direct measurement of its diameter, by means of
the formula $V = \pi d^3/6$, the result is a derived
measurement. If, on the other hand, the volume were determined
by measuring directly, with a graduated vessel, the volume of
liquid it displaces from a filled container, the result would be
a direct measurement, even though arrived at by a roundabout
procedure. Today I shall limit my discussion, for convenience,
to direct measurements, and direct-measurement processes.
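
As a small illustration of a derived measurement, the sketch
below applies the formula V = pi d^3 / 6 to a handful of diameter
readings. The readings and their unit are hypothetical values
chosen for the example; they are not data from the address.

```python
import math


def sphere_volume_from_diameter(d):
    """Derived measurement: V = pi * d**3 / 6 for a sphere of diameter d."""
    return math.pi * d ** 3 / 6.0


# Hypothetical repeated direct measurements of one ball's diameter, in cm.
diameters_cm = [2.01, 1.99, 2.00, 2.02, 1.98]

volumes_cm3 = [sphere_volume_from_diameter(d) for d in diameters_cm]
for d, v in zip(diameters_cm, volumes_cm3):
    print(f"d = {d:.2f} cm  ->  V = {v:.4f} cm^3")
```

Each volume here is only as good as the direct measurement of
diameter behind it, so disagreement among the diameter readings
carries over to the derived volumes.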

A direct-measurement process is essentially a production
process, the "product" being the numbers, that is, the
measurements, it yields. There are two aspects of this process,
the quantitative and the qualitative. The quantitative aspect
consists of the readings or the observations themselves, which
are the end product of the process and are in the form of
numbers. The qualitative aspect consists of the manipulation of
an instrument (or apparatus) and the taking of readings by
someone, or by an automatic recording device, under prescribed
conditions in accordance with specific instructions (i.e., rules
of procedure). Thus, the factors that enter into the measurement
of any quantity are the observer-apparatus combination employed
(i.e., the person(s), the apparatus, and all the auxiliary
materials, such as reagents, sources of illumination, etc.), the
conditions under which the measurement operations are carried
out, and the instructions followed. 1/

A characteristic of direct measurement is the disagreement of
repeated measurements of allegedly the same quantity. Experience
shows that repeated measurement of the same magnitude generally
results in a series of non-identical numbers. To explain these
discordances we introduce the concept of errors, which we
interpret to be the manifestations of variations in the
execution of the process of direct measurement resulting from
"the imperfections of instruments, and of the organs of sense,"
and from the impossibility of achieving (or even specifying with
a finite number of words) the ideal of perfect control of
conditions and procedure.

It is, of course, highly desirable that our measurements be
reliable, by which I mean not that they are totally free from
error -- this we can never achieve -- but simply that such
errors as they do contain are negligible in the sense that
decisions or conclusions based upon the measurements as they
stand will not differ in any important respect from the
decisions or conclusions that would follow if the errors they
contained could be and were removed. 2/

1/ For an excellent discussion of the qualitative aspects of
measurement from an operational point of view, see W. A.
Shewhart, Statistical Method from the Viewpoint of Quality
Control (edited by W. Edwards Deming), The Graduate School,
U. S. Department of Agriculture, Washington, 1939, p. 130 ff.

2/ Compare Shewhart, loc. cit., Rule 1 (p. 88) and Rule 2
(p. 92).


The degree of reliability required of a set of measurements
depends primarily on the uses for which they are intended, but
one should not ignore the requirements of other uses to which
they are likely to be put. A set of measurements whose
reliability is unknown is worthless; worse, it may be dangerous.
A man is to be pitied who must of necessity reach a decision in
some matter and to guide him has only data of inadequate or
unknown reliability. In such a case he is forced to act much as
did Steyning in Chapter VI of Kipling's story Captains
Courageous: "Steyning tuk him for the reason that the thief tuk
the hot stove -- bekaze for there was nothing else that season."

The reliability of a set of measurements as a basis for decision
in some particular respect is, strictly speaking, unknowable,
but can usually be inferred -- but not without some risk of
being incorrect -- from the estimated precision and conjectured
limits to the possible bias, that is, from the inferred
accuracy, of the process by which the measurements were
obtained. By the bias, or systematic error, of a
direct-measurement process we mean the magnitude of its tendency
to measure something other than what was intended; by its
precision, the closeness together, that is, the degree of
agreement amongst, repeated measurements of the same fixed
quantity; and by its accuracy, the comprehensive term, the
closeness of such measurements to the actual magnitude
concerned. It is most unfortunate, I feel, that in popular
parlance we often talk of "accuracy and precision," because
accuracy includes "precision," but the converse is not
necessarily true. It is less confusing, therefore, if we talk
about mutually distinct concepts such as precision and bias (or
systematic error, as it is often termed, with less stigma,
perhaps). Indeed, if I succeed today in accomplishing no more
than making clear to you the distinction between these three
terms -- precision, bias, and accuracy -- our time together, I
feel, will have been well spent. I hope, however, to accomplish
a bit more if time permits.

The distinction between accuracy and precision as applied to
measurement, measurement processes, and measuring instruments,
is as follows:

(1) The accuracy of a measurement process pertains to degree of
conformity to the truth of measurements generated by repeated
applications of the process under fixed circumstances.

(2) The precision of a measurement process pertains solely to
the degree of conformity of the measurements among themselves,
and hence to the degree of their conformity to the average value
characteristic of the process in the particular circumstances
concerned, quite irrespective of whether this average value is
or is not the 'true value'.

In other words, accuracy refers to the closeness of the
measurements to the 'true value' -- closeness to some reference
or standard value accepted as the truth -- whereas precision
refers merely to their closeness together. Thus, accuracy
expresses a relation to a value external to the measurement
process; precision, to a value internal to the process.

An accurate method of measuring some quantity is, therefore, a
method that is both precise and unbiased, in the sense that it
yields measurements that are closely clustered and centered on
the 'true value'. (Such a situation is portrayed in the upper
left-hand quadrant of Figure 1.) If the measurements are closely
clustered, but centered on some value other than the 'true
value', then the method is precise, but biased, and hence
inaccurate. (The upper right-hand quadrant of Figure 1 portrays
such a situation.) If the measurements are widely scattered, but
nevertheless are centered on the 'true value', the method is
unbiased, but imprecise, and hence inaccurate. (This situation
is depicted in the lower left-hand corner of Figure 1.) Finally,
if the method is both biased and imprecise, it is a fortiori
inaccurate. (The lower right-hand quadrant of Figure 1
illustrates this situation.) 3/
3/ You will, I believe, find that the foregoing distinctions are
consistent with those drawn between accurate and precise in the
synonymic article that appears under "correct," in Webster's New
International Dictionary.

[Figure 1. Inter-relations of bias, precision, and accuracy.
Panel labels: un-precise, precise; un-biased, biased; true
value.]

[Figure 2: only the label "unbiased procedure" survives.]

From what I have just said, however, I most certainly do not
want you to infer that an unbiased procedure is always to be
preferred to a biased one. Indeed, a procedure with a small bias
and a high precision can be more accurate than an unbiased
procedure of low precision. (See Figure 2.) It is important to
realize this, for in practical life it is often far better to
always be quite close to the true value than to deviate all over
the place in individual cases but be strictly correct on the
average. Consider carpentry: I sincerely doubt whether even the
best of carpenters hit nails with absolutely no bias (up, down,
right, or left) on the average, but good carpenters surely don't
miss the nail altogether very often, and are certainly to be
preferred to an imprecise but well balanced novice who hits most
every spot within six inches of the nail, with absolutely no
bias in the long run, but rarely if ever hits the nail itself.
This we must remember: in practical life we rarely make a very
large number of decisions of a given type -- we can't wait to be
right on the average -- our decisions must stand up in
individual cases as often as possible!
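
The comparison between a slightly biased but precise procedure
and an unbiased but imprecise one can be made concrete with a
simulation. The sketch below is an illustration only: the normal
error law, the 'true value' of 100, the two (bias, sigma) pairs,
and the tolerance of 1 unit are all assumptions chosen for the
example. It summarizes overall inaccuracy by the root-mean-square
deviation from the true value, which combines bias and spread.

```python
import math
import random

TRUE_VALUE = 100.0  # hypothetical "true value"


def simulate(bias, sigma, n, rng):
    """Draw n measurements from a normal process with given bias and sigma."""
    return [rng.gauss(TRUE_VALUE + bias, sigma) for _ in range(n)]


def rms_error(measurements):
    """Root-mean-square deviation from the true value (overall inaccuracy)."""
    total = sum((x - TRUE_VALUE) ** 2 for x in measurements)
    return math.sqrt(total / len(measurements))


def share_within(measurements, tol):
    """Fraction of individual measurements within tol of the true value."""
    hits = sum(abs(x - TRUE_VALUE) <= tol for x in measurements)
    return hits / len(measurements)


rng = random.Random(7)
carpenter = simulate(bias=0.5, sigma=0.2, n=50_000, rng=rng)  # biased, precise
novice = simulate(bias=0.0, sigma=3.0, n=50_000, rng=rng)     # unbiased, imprecise

for name, xs in (("biased, precise", carpenter), ("unbiased, imprecise", novice)):
    print(f"{name:20s} rms error = {rms_error(xs):.3f}, "
          f"within 1 unit: {share_within(xs, 1.0):.3f}")
```

Under these assumed figures the biased procedure lands within
the tolerance in nearly every individual case, while the
unbiased one does so only occasionally, even though its long-run
average is on target.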

Despite the foregoing, freedom from bias, that is, freedom from
'large' bias, is a desirable characteristic of a measurement
process. After all we want our measurements to yield us a
determination that we can use as a substitute for the unknown
value of a particular magnitude whose value we need for some
purpose -- we don't want a determination of the value of some
other magnitude whose relation to the one we need is
indefinitely known.


It is clear from what I have said earlier that the problem of
bias, or systematic error, would be licked if we could be sure
that a particular direct-measurement process measured exactly
what was intended. This goal is in effect achieved in the
writing of performance specifications for materials and products
by including within the performance specification itself a
detailed specification of how a particular magnitude is to be
measured, or by referencing a specific method of measurement
given in some supplementary document, and then accepting the
method of measurement so prescribed as defining the "true value"
of the quantity concerned for the purposes of the performance
specification itself. Thus, in the Specifications for Government
Synthetic Rubbers, the tolerances stated for the "viscosities"
of the several synthetic rubbers relate to "viscosity" as
defined by the method of measurement spelled out in detail
elsewhere in these Specifications; and, in the new Federal
Specification for "Rubber; Cellular, Latex Foam," it is stated
that "latex foam rubber shall show a compression set not greater
than 20 percent when tested as described in [Section] 433,"
which is entitled "Compression set," and states that "the
compression set shall be determined as described in method
11005" of Federal Specification ZZ-R-601, Rubber Goods; General
Specifications (Methods of Physical Tests and Chemical
Analyses).
This operational approach to the definition of the true values
of physical and chemical quantities brings us, however, face to
face with another fundamental question: in what sense can we say
that a particular direct-measurement process defines an unique
magnitude, the value of the quantity so determined, when
experience shows that repeated application of the process under
fixed circumstances yields a sequence of non-identical numbers?
What is the value thus defined?

The answer takes the form of a postulate about
direct-measurement processes that has been expressed by N.
Ernest Dorsey (on p. 4 of his "Velocity of Light," Transactions
of the American Philosophical Society, 34, 1-109, October 1944)
as follows:

"The mean of a family of measurements -- of a number of
measurements of a given quantity carried out by the same
apparatus, procedure, and observer -- approaches a definite
value as the number of measurements is indefinitely increased.
Otherwise, they could not properly be called measurements of a
given quantity. In the theory of errors, this limiting mean is
frequently called the 'true' value, although it bears no
necessary relation to the true quaesitum, to the actual value of
the quantity that the observer desires to measure. This has
often confused the unwary. Let us call it the limiting mean."

In my lectures at the National Bureau of Standards, and
elsewhere, I have termed this -- or rather a slightly rephrased
version of it -- the Postulate of Direct Measurement. A
mathematical justification for it can be found in the Strong Law
of Large Numbers, a theorem in the mathematical theory of
probability discovered during the present century. Furthermore,
consideration of the conditions under which the Strong Law is
valid furnishes an indication of the circumstances under which
the Postulate of Direct Measurement is likely to be effective in
practice.

It will suffice here today for us to note that the sole aim of
the Postulate of Direct Measurement is axiomatic acceptance of
the existence of a limit approached by the arithmetic mean of a
finite number n of measurements generated by any
direct-measurement process as $n \to \infty$; and it should be
noted that it says nothing on how the "best" estimate of this
limiting mean is to be obtained from a finite number of such
observations. The Postulate is an answer to the need of the
practical man for a justification of his desire to consider the
sequence of non-identical numbers that he obtains when he
attempts to measure a quantity "by the same method under like
circumstances" as pertaining to a single magnitude, in spite of
the evident discordance of its elements. The Postulate aims to
satisfy this need by telling him that if he were to continue
taking more and still more measurements or observations "by the
same method under like circumstances" ad infinitum, and were to
calculate their cumulative arithmetic means at successive stages
of this undertaking, then he would find that the successive
terms of this sequence of cumulative arithmetic means would
settle down to a narrower and ever narrower neighborhood of some
definite number which he could then accept as the value of the
magnitude that his first set of measurements or observations
were striving to express.
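
The settling down of the cumulative arithmetic means can be
watched in a simulation. The sketch below is illustrative only:
it assumes a process with limiting mean 10.0 and normally
distributed errors of standard deviation 0.5, neither of which
comes from the address, and a finite run can of course only
suggest, not demonstrate, behavior "ad infinitum".

```python
import random
from itertools import accumulate


def cumulative_means(measurements):
    """Return the running arithmetic means after 1, 2, ..., n measurements."""
    running_totals = accumulate(measurements)
    return [total / k for k, total in enumerate(running_totals, start=1)]


# Assumed process: limiting mean 10.0, normally distributed errors, sigma 0.5.
rng = random.Random(16)
xs = [rng.gauss(10.0, 0.5) for _ in range(100_000)]
means = cumulative_means(xs)

for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6d}   cumulative mean = {means[n - 1]:.4f}")
```

The individual measurements stay as discordant as ever; it is
only their cumulative mean that is drawn into a narrower
neighborhood of the assumed limiting value.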

The foregoing can be expressed mathematically as follows: on
some particular occasion, say the "i-th", we may take a number
of successive measurements by a given direct-measurement process
under certain specified circumstances. Let

(1)  $x_{i1},\ x_{i2},\ \ldots$

denote the sequence of measurements so generated. Conceptually
at least, this sequence could be continued indefinitely.
Likewise, on different occasions we might start a new sequence,
using the same measurement procedure and applying it under the
same fixed set of circumstances. Each such fresh "start" would
correspond to a different value of "i." If, as we shall assume,
the measurement process concerned when applied under these
circumstances obeys the Strong Law of Large Numbers, i.e., if
the Postulate of Direct Measurement is applicable, it follows
that we may expect the sequence of cumulative arithmetic means
on the i-th occasion, namely,

(2)  $\bar{x}_{in} = (x_{i1} + x_{i2} + \cdots + x_{in})/n, \quad (n = 1, 2, \ldots),$

to converge to $\mu$, a number that constitutes the limiting
mean associated with this direct-measurement process under the
circumstances concerned, but independent of the "occasion," that
is, independent of the value of "i." The Strong Law of Large
Numbers (see, for example, William Feller, An Introduction to
Probability Theory and its Applications, vol. 1, New York: John
Wiley & Sons, Inc., 1950, p. 207) does not guarantee that the
sequence (2) for a particular value of i will converge to $\mu$
as the number of observations n on this occasion tends to
infinity, but simply states that among the family of such
sequences corresponding to a large number of different starts,
$(i = 1, 2, \ldots)$, the instances of non-convergence to $\mu$
will be rare exceptions. In other words, in practice one is
almost certain to be working with a "good" sequence -- one for
which (2) would converge to $\mu$ if the number of observations
were continued indefinitely -- but "bad" occasions can occur,
though rarely. Thus, the Postulate of Direct Measurement
expresses something better than an "on-the-average" property --
it expresses an "in-almost-all-cases" property. Furthermore,
this limiting mean $\mu$, the value of which each individual
measurement x is trying to express, can be regarded as the mean
or "center of gravity" of the infinite conceptual population of
all measurements x that might conceivably be generated by the
direct-measurement process concerned under the specified
circumstances.
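
The "in-almost-all-cases" character of the statement can be
illustrated by simulating many separate occasions and asking, for
each one, how far its cumulative mean ever strays from the
limiting mean once n is large. This is a sketch under
assumptions: the normal error law, the values MU and SIGMA, the
run lengths, and the tolerance are all chosen for illustration,
and a finite simulation is only suggestive of the limiting
behavior.

```python
import random

MU = 10.0     # assumed limiting mean of the process
SIGMA = 0.5   # assumed standard deviation of individual measurements


def tail_max_deviation(n_total, n_from, rng):
    """Largest |cumulative mean - MU| over n_from <= n <= n_total on one occasion."""
    total = 0.0
    worst = 0.0
    for n in range(1, n_total + 1):
        total += rng.gauss(MU, SIGMA)
        if n >= n_from:
            worst = max(worst, abs(total / n - MU))
    return worst


rng = random.Random(207)
# Each call is one fresh "start", i.e. one value of i.
occasions = [tail_max_deviation(5_000, 1_000, rng) for _ in range(200)]

tolerance = 0.05
stragglers = sum(dev > tolerance for dev in occasions)
print(f"occasions whose cumulative mean strays more than {tolerance} "
      f"from MU after n = 1000: {stragglers} of {len(occasions)}")
```

Every occasion uses the same process and the same MU; only the
particular run of errors differs from one start to the next.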

With this as background, we are now in a position to consider
the mathematical definition of the precision of a
direct-measurement process under a fixed set of circumstances.
By definition, the precision of the process has to do with the
closeness together of the individual measurements generated by
the process under these fixed conditions. Otherwise expressed,
it has to do with the closeness together of the two individual
measurements constituting an arbitrary pair. Let us assume for
the moment that under the circumstances chosen the
direct-measurement process gives rise to a sequence (1) of
completely homogeneous measurements -- the full meaning and
import of this qualification will become apparent as we proceed.
Let us now consider the individual measurements (1) to be
grouped arbitrarily into pairs giving rise to the derived
sequence of differences

(3)  $d_1,\ d_2,\ \ldots,\ d_n,\ \ldots$

where the additional subscript i has been omitted for
convenience. Some of these differences will be positive, and
others negative, and it is not difficult to show that whatever
be the (finite) value of the limiting mean $\mu$ associated with
the sequence (1), the limiting mean $\delta$ associated with the
sequence of d's, (3), will be identically 0. Consequently, the
limiting mean of these differences is utterly useless as a
measure of compactness of the original sequence (1). On the
other hand, it is clear, I believe, that just as each individual
measurement x is striving to express the value of the limiting
mean $\mu$, so also is each of the differences d, if its sign be
neglected, striving to express the characteristic spread between
two arbitrary measurements x.

To get rid of the signs of the d's, let us therefore consider
instead of (3) the sequence

(4)  $d_1^2,\ d_2^2,\ \ldots,\ d_n^2,\ \ldots$

of the squares of these differences between arbitrary pairs. It
is not difficult to determine conditions on the measurements
themselves, that is, the x's, sufficient to ensure that the
sequence (4) of the $d^2$ values will also obey the Strong Law
of Large Numbers, and be associated with a limiting mean, say
$2\sigma^2$, where $\sigma$, termed the standard deviation of
the measurements themselves (the x's), is simply the radius of
gyration of the aforementioned infinite population of x-values
about its center of gravity $\mu$; in other words, $\sigma^2$ is
simply the average value of $(x - \mu)^2$ in this infinite
conceptual population of possible measurements x.
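
Both limiting means, that of the signed differences and that of
their squares, can be checked numerically. The sketch below
assumes independent, normally distributed measurements with
MU = 10.0 and SIGMA = 0.5 and pairs them consecutively; these
are illustrative choices, not conditions stated in the address.

```python
import random
import statistics

MU = 10.0    # assumed limiting mean
SIGMA = 0.5  # assumed standard deviation

rng = random.Random(4)
xs = [rng.gauss(MU, SIGMA) for _ in range(200_000)]

# Group the measurements into consecutive, non-overlapping pairs and take
# the signed differences d, as in sequence (3).
ds = [xs[k] - xs[k + 1] for k in range(0, len(xs), 2)]

mean_d = statistics.fmean(ds)                      # expected to be near 0
mean_d_sq = statistics.fmean([d * d for d in ds])  # expected near 2 * SIGMA**2

print(f"mean of d     = {mean_d:+.4f}")
print(f"mean of d^2   = {mean_d_sq:.4f}")
print(f"2 * SIGMA**2  = {2 * SIGMA ** 2:.4f}")
```

The mean of the signed differences carries no information about
spread, whereas the mean of their squares recovers twice the
variance of the individual measurements.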

Since the precision of the process obviously decreases as the value
of $\sigma$ increases, and vice versa, it is natural to take some
inverse function of $\sigma$ as a measure of precision. Thus, Gauss
(cf. Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis
Solem Ambientium, Hamburg, 1809, Article 178) adopted as his modulus
of precision the quantity $h = 1/(\sigma\sqrt{2})$, which we see to
be the square root of the reciprocal of the limiting mean of the
$d^2$ sequence (4).
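
A small numerical sketch may make the relation between the d's, their
squares, and Gauss's modulus h concrete. It is an illustration only:
the normal distribution, the particular values of mu and sigma, and
the helper name pair_differences are assumptions introduced here; the
argument itself does not require normality.

```python
import math
import random
import statistics


def pair_differences(mu, sigma, n_pairs, seed=0):
    """Differences d = x' - x'' between arbitrary pairs of simulated measurements."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) - rng.gauss(mu, sigma) for _ in range(n_pairs)]


mu, sigma = 10.0, 0.5  # hypothetical limiting mean and standard deviation
d = pair_differences(mu, sigma, n_pairs=200_000)

mean_d = statistics.fmean(d)                   # tends to 0 whatever mu is
mean_d2 = statistics.fmean(x * x for x in d)   # tends to 2 * sigma**2
h_from_d2 = math.sqrt(1.0 / mean_d2)           # modulus of precision from the d^2 sequence
h_exact = 1.0 / (sigma * math.sqrt(2.0))

print(f"mean of d   = {mean_d:.4f}")
print(f"mean of d^2 = {mean_d2:.4f}  (2*sigma^2 = {2 * sigma**2:.4f})")
print(f"h from d^2  = {h_from_d2:.4f}  (1/(sigma*sqrt(2)) = {h_exact:.4f})")
```

Changing mu leaves all three printed quantities essentially unchanged,
which is the sense in which the signed differences carry no
information about compactness while their squares do.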

Mathematically the foregoing discussion can be carried out equally
well in terms of the absolute (unsigned) values of the d's, instead
of in terms of their squares. Such an approach, however, has several
disadvantages. In the first place, the limiting mean, say $\delta'$,
associated with a sequence analogous to (3) but in which the signs of
the d's are ignored, has a less vivid geometrical (or should I say
mechanical) interpretation than has $\sigma$; and $\delta'$ works out
to be equal to $a\sigma$, where the constant a depends on the
particular functional form of the distribution of the measurements x
about their limiting mean $\mu$. Secondly, as we shall now see,
components of error are additive in terms of squared quantities such
as $\sigma^2$, so that in this sense $\sigma^2$ is a more appropriate
measure of the dispersion of the x's about their limiting mean $\mu$
than is $\sigma$ itself or any constant multiple of it.
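
To see how the constant a depends on the form of the distribution,
one can estimate the mean absolute difference for two distributions
having the same standard deviation. This is a sketch under assumed
inputs: the choice of the normal and uniform distributions and the
helper name mean_abs_difference are illustrative. For these two cases
the exact constants are known to be 2/sqrt(pi) and 2/sqrt(3)
respectively.

```python
import math
import random
import statistics


def mean_abs_difference(draw, n_pairs, seed=0):
    """Average of |x' - x''| over n_pairs simulated pairs; draw(rng) yields one x."""
    rng = random.Random(seed)
    return statistics.fmean(abs(draw(rng) - draw(rng)) for _ in range(n_pairs))


sigma = 1.0
half_width = sigma * math.sqrt(3.0)  # uniform on [-w, w] has standard deviation w/sqrt(3)

a_normal = mean_abs_difference(lambda r: r.gauss(0.0, sigma), 200_000) / sigma
a_uniform = mean_abs_difference(
    lambda r: r.uniform(-half_width, half_width), 200_000
) / sigma

print(f"normal : a = {a_normal:.3f}  (exact 2/sqrt(pi) = {2 / math.sqrt(math.pi):.3f})")
print(f"uniform: a = {a_uniform:.3f}  (exact 2/sqrt(3)  = {2 / math.sqrt(3.0):.3f})")
```

The mean of the squared differences, by contrast, is 2*sigma^2 for
both distributions.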

In the foregoing we assumed for convenience that the individual
measurements forming the sequence (1) were completely homogeneous.
In practice this is rarely the case, and a more common situation is
that in which a sequence (1) consists of a series of "sections," with
the measurements in any one section being homogeneous with respect to
each other, and pairwise closer together on the average than two
measurements one of which comes from one section and the other from
another. In the simplest of such cases, if we form a sequence such as
(4) composed of the squares of differences between arbitrary pairs of
measurements from within each of the respective sections, then the
limiting mean of such a sequence of $d^2$'s will be of the form
$2\sigma_w^2$, where $\sigma_w^2$ is the within-group variance. If,
on the other hand, we form arbitrary pairs consisting of one
measurement from each of two different sections, then the limiting
mean of a sequence of such $d^2$'s will be of the form
$2(\sigma_w^2 + \sigma_b^2)$, where $\sigma_b^2$ is the between-group
variance.

In such a situation, if $\bar{x}_n$ is in fact the average of a
total of n = km measurements, composed of m measurements from each
of k different sections, then over an (infinitely) large number of
such experiments, i.e., different "starts," the average value of
$(\bar{x}_n - \mu)^2$ will be

(5)  $\frac{1}{k}\left(\sigma_b^2 + \frac{\sigma_w^2}{m}\right)$,

from which it is clear that, if $\sigma_b^2$ is at all sizeable
compared to $\sigma_w^2$, then $\bar{x}_n$, for fixed n = km, will
have greater precision if based on a large number k of different
sections, with only a small number m of values from each section.
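
The trade-off between k and m for fixed n can be tabulated directly.
The sketch below assumes that expression (5) has the standard form
(sigma_b^2 + sigma_w^2/m)/k for k sections of m readings each; the
numerical variances and the function name variance_of_mean are
hypothetical.

```python
def variance_of_mean(sigma_b2, sigma_w2, k, m):
    """Variance of the average of m readings from each of k sections,
    assuming the form (sigma_b^2 + sigma_w^2 / m) / k for expression (5)."""
    return (sigma_b2 + sigma_w2 / m) / k


sigma_b2, sigma_w2 = 4.0, 1.0  # hypothetical between- and within-section variances
n = 24                         # total number of measurements, held fixed

for k in (1, 2, 4, 8, 24):
    m = n // k
    v = variance_of_mean(sigma_b2, sigma_w2, k, m)
    print(f"k = {k:2d}, m = {m:2d}: variance of the average = {v:.4f}")
```

With these values the variance falls steadily as the same 24 readings
are spread over more sections, because only the between-section
component is sensitive to how n is divided.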

The foregoing can be interpreted, I believe, in the language of your
science, as I understand it, somewhat as follows: Let "sections"
correspond to different prints from a photographic negative so that
measurements within a particular "section" are repeated
determinations of the distance between two points on the ground, say,
as determined through successive measurements of this distance on
this single print. Let us now suppose further that there are two
points on the first print, the distance between which on the ground
is accurately known, having been determined to everybody's
satisfaction with sufficient accuracy by the man on the ground with
the invar tape and other auxiliary apparatus of a professional
surveyor. Let the true value of this distance be denoted by X and let
us suppose that the problem is to determine the distance between two
other points, the true value of which is, say, Y. One method of doing
this would be to make successive independent measurements
$x_1, x_2, \ldots, x_m$ of the distance between the two standard
points on the photographic print, and an equal number of independent
measurements $y_1, y_2, \ldots, y_m$ of the distance between the
other two points on this same photographic print. One could then take
as an estimate of Y the quantity

(6)  $\hat{Y} = X + (\bar{y} - \bar{x})$,

where $\bar{y}$ and $\bar{x}$ are the averages of the y and x
determinations respectively, which we assume are all mutually
independent and have precision implied by individual variances of
$\sigma_w^2$.

The variance of $\hat{Y}$ as an estimator of Y will then be
$2\sigma_w^2/m$. However, $\hat{Y}$ may be a biased estimator of Y,
the magnitude of the bias, $\beta_1$, being a property of this first
print. One could check this latter by calculating a $\hat{Y}_i$ from
each of k different plates (i = 1, 2, ..., k) and then checking to
see whether the quantity

(7)  $\sum_{i=1}^{k} (\hat{Y}_i - \bar{Y})^2 / (k-1)$,

where $\bar{Y}$ is the average of the $\hat{Y}_i$, is significantly
larger than $2\sigma_w^2/m$. If it is significantly too large (and we
have statistical tests for answering this question), then we must
conclude that (7) is an estimator not of $2\sigma_w^2/m$ but of
$\sigma_b^2 + 2\sigma_w^2/m$, where $\sigma_b^2$ denotes the variance
of the biases $\beta_1, \beta_2, \ldots$ about their limiting mean
$\beta$, say. The bias of the photo-print method is measured by
$\beta$. In such a case the variance of the over-all average
$\bar{Y}$ obtained from all prints will be given by

(8)  $\frac{1}{k}\left(\sigma_b^2 + \frac{2\sigma_w^2}{m}\right)$,

where the additional "2" (in comparison to (5)) comes from the fact
that on each print the comparison of measurements on two different
distances is involved. Clearly, whether it is desirable to take a
large number of measurements on only a few prints, or only a few
measurements on each of a large number of prints, will depend on the
relative magnitudes of $\sigma_b^2$ and $\sigma_w^2$. Furthermore, it
is evident that instead of considering different prints from the same
photographic plate, one might consider instead different photographic
plates obtained on different flights over the region by an airplane,
and so forth and so on.
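
The check described above can be tried on simulated data. The sketch
below is hypothetical throughout: it assumes normal errors, assumes
that each print's bias enters additively into the y readings, and
uses made-up values for X, Y, the bias distribution, and the
variances. It computes one estimate per print, forms the
sample-variance quantity described above, and compares it with
2*sigma_w^2/m.

```python
import random
import statistics


def simulate_print_estimates(X, Y, k, m, beta, sigma_b, sigma_w, seed=0):
    """Return k estimates Y_hat_i = X + (ybar_i - xbar_i), one per simulated print."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(k):
        bias = rng.gauss(beta, sigma_b)  # bias peculiar to this print (assumed additive)
        xs = [rng.gauss(X, sigma_w) for _ in range(m)]
        ys = [rng.gauss(Y + bias, sigma_w) for _ in range(m)]
        estimates.append(X + statistics.fmean(ys) - statistics.fmean(xs))
    return estimates


X, Y = 100.0, 140.0                     # hypothetical true distances
k, m = 2000, 5                          # prints, and readings of each distance per print
beta, sigma_b, sigma_w = 0.3, 0.4, 0.5  # hypothetical bias mean and standard deviations

estimates = simulate_print_estimates(X, Y, k, m, beta, sigma_b, sigma_w)
between_prints = statistics.variance(estimates)  # sum (Y_hat_i - Ybar)^2 / (k - 1)
within_only = 2 * sigma_w**2 / m                 # value expected if prints had no bias spread
with_bias_spread = sigma_b**2 + within_only      # value expected when biases vary

print(f"variance among print estimates = {between_prints:.4f}")
print(f"2*sigma_w^2/m                  = {within_only:.4f}")
print(f"sigma_b^2 + 2*sigma_w^2/m      = {with_bias_spread:.4f}")
print(f"variance of over-all average   = {with_bias_spread / k:.6f}")
```

In practice sigma_w^2 would itself be estimated from the readings
within prints and the comparison made with a formal test, such as a
variance-ratio test, rather than by eye.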

In applied science one often speaks of "repeating the determination"
of some quantity. In a setting such as the foregoing one should be
clear on exactly what one means by a "repetition." Does it mean more
measurements of the same kind on the same print by the same
observer-equipment-procedure combination; or would the same
observer-equipment-procedure combination be employed using different
prints; or would various but equivalently-trained observers be
employed with the same or similar equipment, using the same or
similar procedures, at the same or various places, using the same or
different prints, from the same or different negatives; and so forth?
Clearly it is not possible to talk unambiguously about the precision
of a particular method of measurement without indicating the
character of the "repetitions" involved in generating the sequence of
like measurements with respect to which the "precision" is supposed
to apply.

There is also the problem of how to proceed in making "repeated
measurements" so that the results obtained will be independent in the
statistical sense: if one is measuring the distance between two
points on a print with the same calibrated scale over and over again,
it is exceedingly difficult, if not impossible, to obtain independent
readings unless one deliberately introduces a random positioning of
the scale in each instance. Alternatively, we might use a series of
different graduated scales of the same general type but which had
different calibration corrections. In this way one might help the
"rounding errors" and "reading errors" to balance out. The use of
measuring rods with unevenly spaced scale divisions for this purpose
was discussed by P. C. Mahalanobis, F.R.S., in a lecture at the
National Bureau of Standards on November 13; a summary of his lecture
will be found in the ASTM Bulletin for January 194?, pp. 64-66. I
commend this matter to your attention.
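
The effect of random positioning on rounding errors can be shown with
a toy model. Everything in it is assumed for illustration: the scale
is read to the nearest whole graduation at each end of the object,
there is no reading error other than rounding, and the true length of
205.37 units is made up.

```python
import math
import random
import statistics


def read_scale(position):
    """Reading of a mark at `position`, taken to the nearest whole graduation."""
    return math.floor(position + 0.5)


def measure(length, offset):
    """Length obtained as the difference of the readings at the two ends."""
    return read_scale(offset + length) - read_scale(offset)


true_length = 205.37  # hypothetical, in scale units

# Scale laid down in exactly the same position every time.
fixed = [measure(true_length, 0.0) for _ in range(1000)]

# Scale laid down at a random position (uniform over one graduation) each time.
rng = random.Random(0)
randomized = [measure(true_length, rng.random()) for _ in range(20_000)]

print(f"fixed position : distinct readings = {sorted(set(fixed))}")
print(f"random position: average reading   = {statistics.fmean(randomized):.3f}")
```

With the fixed position every "repetition" reproduces the same
rounding error, so averaging gains nothing; with random positioning
the rounding errors vary from reading to reading and balance out in
the average.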

Finally, I feel that a few words are in order on the subject of
"true value." Earlier in discussing the bias of a direct-measurement
process I remarked that the bias is defined to be the difference, say
$\mu - \tau$, between the value $\mu$ that the process measures (its
limiting mean) and the true value, $\tau$. This immediately raises
the question: How is the "true value" of a property or characteristic
defined?

To answer this question we begin first by noting with P. W. Bridgman
that a property or characteristic of the physical world is defined,
in the last analysis, by specification of a method of measuring its
quantity:

"What do we mean by the length of an object? We
evidently know what we mean by length if we can
tell what the length of any and every object is,
and for the physicist nothing more is required.
To find the length of an object, we have to perform
certain physical operations. The concept of
length is therefore fixed when the operations by
which length is measured are fixed: that is, the
concept of length involves as much as and nothing
more than a set of operations by which length is
determined. In general, we mean by any concept
nothing more than a set of operations; the concept
is synonymous with the corresponding set of
operations. If the concept is physical, as of length,
the operations are actual physical operations,
namely, those by which length is measured; or if
the concept is mental, as of mathematical continuity,
the operations are mental operations, namely
those by which we determine whether a given
aggregate of magnitude is continuous... We must demand
that the set of operations equivalent to any
concept be a unique set, for otherwise there are
possibilities of ambiguity in practical applications
which we cannot admit."

(P. W. Bridgman, The Logic of Modern Physics,
Macmillan, New York, 1927, pp. 5 and 6.)

It should be clear to us from what Bridgman has said that if all of
you and I are to agree on what is meant by the length of this
blackboard along its lower edge, then we must first come to an
agreement on a sequence of operations that is to be taken as defining
the concept of "length" in this case. This done, the true value of
the length of this blackboard along its lower edge will then be
uniquely determined by the limiting mean associated with the
agreed-upon procedure. In brief, as W. Edwards Deming has observed,
the true value of a characteristic [...] EXEMPLAR PROCESS for
measurement of the characteristic. (W. Edwards Deming, Some Theory of
Sampling, John Wiley & Sons, New York, 1950, pp. 15-17.)

[Figure 3. The evolution of a real-life direct-measurement process.]

Dr. Deming actually uses the term "preferred procedure" and notes
that "a preferred procedure is distinguished by the fact that it
supposedly gives or would give results nearest to what are needed for
a particular end; and also by the fact that it is more expensive, or
more time-consuming, or even impossible to carry out;" and "as a
preferred procedure is always subject to modification or
obsolescence, we are forced to conclude that neither the accuracy nor
the bias of any procedure can ever be known in a logical sense. The
precision of a random or stable procedure, however, may be measured
and known."

As I see it, the evolution of a real-life direct-measurement process
is essentially as shown in Figure 3. From experience we are aware of
recognisable changes in things going on about us, and we say that
these take place with the passage of "time"; we have a "feel" for
what we mean by "time"; we can talk about it with one another, etc.;
but we find ourselves beset with many difficulties when we try to
define exactly what we mean by "time." If we try very hard to define
"time," we will find that we wind up by agreeing that we accept the
successive occurrences of certain recognisable events as "measuring"
the passage of "time." For example, we may note the accumulation of
sand in the bottom chamber of an hourglass, or we may count the
oscillations of a pendulum, as in a grandfather's clock, or the
oscillations of an atomic system as in an "atomic clock." Similarly,
with regard to length. We are aware of the experience of the
separation of objects in space. From our study of Euclid we feel that
we have "knowledge" of the meaning of the "length" of a line, that
is, of the distance between two points; but remember that Euclid was
writing abstract mathematics and not physics. If we ask so simple a
question as what is the "length" of this blackboard along its bottom
edge, we find that all seems to be very clear until we begin to get
picayune about it. If our knowledge of physics goes only so far as
awareness of molecules, we may say that the "outside point" at the
left end is by definition the outsidemost point on the leftmost
molecule. Then someone rushes into the room and tells us about the
Mendelian theory of inheritance. We listen attentively, turn this new
idea over in our minds, and say to ourselves, "Very interesting, but
it does not seem to have any bearing on how we ought to define
length." Then we are told, perhaps, about atomic theory, and this
definitely worries us, for we see that we will now have to define the
endpoint on the left as the center, say, of the leftmost electron of
the outermost atom; but this brings up new difficulties, for the
electron is whirling around and not staying in any one place, so we
will have to take the outermost point of its path. But how are we
going to see it, because if we shine some light over here, the light
"waves" or "particles," according to what you are believing this
morning, will push the electrons out of their normal courses and so
when we try to see the outside point we move it. Clearly this is
getting us nowhere fast!

We therefore pull ourselves up to a halt with a reminder that we
wanted to know the "length" of the blackboard along the bottom edge
there for a purpose: so that we could cut off a board of the right
"length" and nail it on there to form a chalk tray. We then call on
experience again. We have seen yardsticks or metersticks lying
around, and so from these we abstract the concept of a rigid bar
ideally marked out with graduations that are perfectly equidistant
apart, the space between two such graduations being one unit of
"length." Fine! All we have to do now is count how many units of
length there are in the bottom edge of this blackboard. So we try it.
We find that it is between 205 and 206 units, say. A unit is rather
large, so that a half or a quarter unit would look unsightly sticking
out endwise on the blackboard, so we subdivide our "unit," or use
some other ingenious means of getting a finer distinction. The
purpose at hand determines how far we attempt to proceed with this
business of making finer and finer distinctions. If our bar has been
lying on the radiator so that each of the "units" on it has grown a
little and is really bigger than a real unit, and the lumber is
outdoors in the cold, so that our bar will shrink and have "units"
that are really too small when we take it outdoors to measure the
lumber, then we apply temperature corrections based on physical
theory. And so forth and so on. I believe you get the idea. I can
hardly make it more precise than this in the time I have here.

The important thing is to let the purpose determine the refinement
that you are willing to go to. And one should keep in mind not only
the immediate purpose but also other uses to which the result might
be put.

For example, I am told that the makers of topographic maps from
on-the-ground surveys generally draw in the contour lines by
intuition and the general "feel" of the landscape as they stand there
looking at it. The resulting contours may be adequate for some
purposes, but not for all. The following story has always amused me
in this connection: The elevation of Shongum Lake in New Jersey was
determined a good many years ago by C. C. Vermeule to be 698 feet. On
the other side of the mountain east of the lake, he sketched in a
ravine using uniformly interpolated contours between two roads along
which levels were run, and which the ravine cuts across. Years later
a reservoir was built in this ravine to supply the insane asylum near
Morris Plains, New Jersey; and in subsequent maps the reservoir was
sketched in. This reservoir, from the adjacent contours on such maps,
had an apparent elevation of 640 feet. In consequence someone got the
idea of building a gravity feeder line from Shongum Lake, via a
nearby saddle point, to the reservoir, so that the Lake could be used
to supplement the reservoir. Unfortunately, a special survey revealed
the actual height of the reservoir to be about 720 feet, quite a bit
higher than the 640 feet suggested by the contours, so that the water
from the reservoir would have, in fact, flowed into the Lake!
Photogrammetry, I am told, provides a means of drawing in numerous
contours quite accurately, and at much less cost than by ground
surveys. Let's hope that you succeed in making them accurate enough
for all uses likely to be made of them.

SUPPLEMENTARY DISCUSSION
(Following Mr. Bicking's Paper)

MR. KATZ:

. . . . .

I wonder if Captain Reading would like to say something about this
whole business. I referred earlier to the fact that, by and large,
European photogrammetrists are very much concerned with statistical
method and the theory of errors. On numerous occasions they have
pointed out that, with always present exceptions, we over here have
not yet seen fit to apply these things to American photogrammetry.

CAPTAIN READING: That is quite correct. I think we in photogrammetry
in the United States have not had time to study the theory of
measurement. We have been mystified a bit by the second diagram of
the four drawn by Dr. Eisenhart. Dr. Eisenhart seemed to prefer the
narrow range with bias to a truer average with wider dispersed
values.

There has been a tremendous amount of publication and discussion in
the past that has failed to register or strike home, largely because
of differences in terminology and lack of agreed-upon definitions for
these terms. Clarification of our terms would get us talking the same
language. I remember in 1948 the Europeans seemed to set great store
on trying to run down the difference between systematic and
accidental errors. The theory seemed to be that there was some
distinction that should be made between the two and then we had the
process nicely corrected and were doing the most efficient thing. But
the reason we have been skeptical about applying all these
mathematical and statistical methods is because we were always
stopped by this type of question: should we spend our money on first
order theodolite observations that are read to one second, or should
we spend our money running photograph after photograph through a
coordinate setting machine and repeating flight after flight in the
air when we can read say to only 12 seconds? Now most of us have said
that we would use the theodolite and forget about the trips through
the plotting machine and through the air.

But there is a very real need to know just exactly where to draw the
line, because sometimes we are up against the situation where one
approach is very expensive and the other much less so.

I would like to ask Dr. Eisenhart to clarify this preference for the
narrowed spread with bias over the unbiased wider spread. What are
the criteria you use to know which is better?

DR. EISENHART: I'd certainly better clear this up. The comparison and
choice that I made is really only available when you have some very
fine measurement procedure (an exemplar procedure) that all will
agree yields, at least in the limit, the "true value" of the
magnitudes concerned. You see in Figure 2 there is a mark on the
horizontal axis to indicate the "true value" and you have to know
where that value is in order to be able to decide whether a given
distribution of readings is centered on that point, or near to it, or
far from it. For example, if you are working on certain kinds of
ground surveys, there are very likely some types of distances on the
ground that all will agree that the fellow on the ground with an
invar tape can measure more accurately than a fellow can from an
aerial photograph made from 40,000 feet. If you have such a method of
measurement, which may be very expensive but which professionals in
the field agree upon as giving you what we will call the "true
value," then this can serve as our reference system. Suppose now that
we have two alternative inexpensive methods where one of these tends
to yield more widely dispersed values but without bias, that is,
individual measurements obtained by this procedure are widely
dispersed but as a group are centered upon the true value. This
situation is represented diagrammatically by the flatter curve in
Figure 2. With this method if you took, say, 100 readings, the
average would be almost certain to be very close to the true value;
but a single reading, or even an average of say five readings, would
have a good chance of being quite far off.
The other of these inexpensive methods, we assume, is characterized
by the taller of the two curves in Figure 2, that is, individual
readings by this method are hardly dispersed at all, but they have a
little bias. Now it is clear from the figure, I believe, that if cost
or time or some other restriction limited us to a single reading,
then we would have a much better chance of getting a reading close to
the true value by using this second method than we would have with
the first. On the other hand, since this second more precise
procedure has a small positive bias, the average of a large number of
readings by this method will be almost certain to be too high. Now
here is the important point: if the cost per reading is the same for
either of these two procedures, and if neither cost nor time place a
severe limit on the number of readings that you may take, then there
will be a number of readings $n_0$ such that for averages of $n_0$ or
more readings the probability of such an average being within
$\pm\varepsilon$ of the true value, for any assigned
$\varepsilon > 0$, will be greater, i.e. closer to 1, for the first
method (the one with more widely dispersed individual values) than
for the second; indeed, as the number of observations averaged
together tends to infinity, this probability will tend to 1 for the
first method, and to 0 for the second method for all values of
$\varepsilon$ less than the magnitude of the bias of the second
method. On the other hand, as we have already seen, for only single
readings, or averages of small numbers (much) less than $n_0$, the
second method may put answers closer to the bull's-eye. If there is a
difference in cost per reading with the two methods, then the same
principles apply, although the computation may be slightly more
difficult: if, say, one can take ten readings with the
widely-dispersed unbiased method for the same cost as a single
reading by the narrow biased method, then it may, perhaps, be the
case that averages of five readings by the narrowly-dispersed biased
system would have a greater probability of being closer to the true
value than would averages of 50 readings by the widely-dispersed
unbiased method; but the situation may be reversed in the case of
averages of 500 readings by the unbiased method with the wide
dispersion and 50 readings by the biased system of narrow dispersion.
I hope that I have succeeded in making clear to you that there are
certainly circumstances where one might do well to choose the second
biased method in consequence of its small bias in relation to its
high precision, even though it will tend to shoot uphill a bit.
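
The comparison of the two methods can be worked numerically. The
sketch below assumes normally distributed readings and uses made-up
values for the two standard deviations, the bias, and the tolerance
(taken smaller than the bias, the case in which the biased method's
probability tends to 0); the function names are illustrative. It
reports the first number of readings at which the unbiased method
becomes the more likely of the two to give an average within the
tolerance.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within(eps, bias, sigma, n):
    """P(|average of n readings - true value| <= eps) for normal readings."""
    se = sigma / math.sqrt(n)
    return normal_cdf((eps - bias) / se) - normal_cdf((-eps - bias) / se)


def crossover_n(eps, wide_sigma, narrow_sigma, narrow_bias, n_max=100_000):
    """First n at which the wide, unbiased method is the more likely of the
    two to give an average within eps of the true value (None if not found)."""
    for n in range(1, n_max + 1):
        p_wide = prob_within(eps, 0.0, wide_sigma, n)
        p_narrow = prob_within(eps, narrow_bias, narrow_sigma, n)
        if p_wide > p_narrow:
            return n
    return None


# Hypothetical values: wide unbiased method vs. narrow method with bias 0.3.
eps, wide_sigma, narrow_sigma, narrow_bias = 0.25, 2.0, 0.2, 0.3

for n in (1, 5, 25, 100, 400):
    p_wide = prob_within(eps, 0.0, wide_sigma, n)
    p_narrow = prob_within(eps, narrow_bias, narrow_sigma, n)
    print(f"n = {n:3d}: unbiased/wide = {p_wide:.4f}, biased/narrow = {p_narrow:.4f}")

print("first n favouring the unbiased method:",
      crossover_n(eps, wide_sigma, narrow_sigma, narrow_bias))
```

A difference in cost per reading can be explored with the same
functions by comparing prob_within at different values of n for the
two methods, for example 10*n readings by one against n by the other.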

I don't want you to focus too much of your attention on this
question of biased versus unbiased, for bias or absence of bias is
only important insofar as it affects accuracy; the important thing is
accuracy! A method of measurement that yields accurate determinations
of a quantity $\alpha$ also provides a means of obtaining accurate
determinations of functions of $\alpha$. Thus, if
$\beta = f(\alpha)$, and a is an accurate determination of $\alpha$,
then b = f(a), the derived value, will likewise be an accurate
determination of $\beta$; but, in general, if a is an unbiased
determination of $\alpha$, b will be a biased determination of
$\beta$. It may be possible to derive from a an unbiased
determination of $\beta$, say b', but b' may have somewhat less
precision than b and possibly, as a consequence, less accuracy.

For example, suppose that the problem is to determine the area of a
circle and that this is to be done by measuring its diameter. Suppose
that the true area is $\pi D^2/4$. Now suppose further that your
diameter-measurement method is unbiased, and individual diameter
measurements d are normally distributed about D with variance
$\sigma^2$. Then, given n independent diameter measurements
$d_1, d_2, \ldots, d_n$, the average of these n measurements,
$\bar{d}$, will be an unbiased estimator of D, with variance
$\sigma^2/n$. The square of $\bar{d}$, however, will be an unbiased
determination not of $D^2$, but of $D^2 + \sigma^2/n$; and be
distributed about this latter magnitude with variance
$4D^2\sigma^2/n + 2\sigma^4/n^2$, under the assumption that the d's
are normally distributed about D.
Since

$s^2 = \sum_{i=1}^{n} (d_i - \bar{d})^2 / (n-1)$

is an unbiased estimator of $\sigma^2$, one could obtain an unbiased
determination of $D^2$ by subtracting $s^2/n$ from the square of
$\bar{d}$; but the price of this adjustment would be to add to the
foregoing expression for the variance of the square of $\bar{d}$ an
additional term $2\sigma^4/[n^2(n-1)]$. Since n or some higher power
of it appears in the denominator of all of the terms of the foregoing
expressions except the term in $D^2$ alone in the first expression,
it is clear that these problems of bias and corrections for it (with
a consequent inflation of the variance) all become unimportant if we
use a large enough number of observations n in the first place.
Furthermore, even for small values of n we might not wish to apply
the foregoing correction to the square of $\bar{d}$ for the following
reason: If, as we have assumed, the distribution of individual
diameter determinations d is symmetrical with respect to the true
diameter D (we have actually assumed that the distribution of d is
normal about D), values of $\bar{d}$ greater than D and values of
$\bar{d}$ less than D will occur equally often in practice in the
long run. Also, for any sensible combination of $\sigma$ and n,
negative values of $\bar{d}$ ought not to occur in practice, so that
the square of $\bar{d}$ will exceed $D^2$ as often as it is less than
$D^2$ in the long run. In consequence, we may say that
$\pi(\bar{d})^2/4$ is a probability-wise unbiased estimator of the
true area.
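
These statements about the square of the average diameter can be
checked by simulation. The sketch below assumes normal diameter
readings with hypothetical values D = 2, sigma = 1 and n = 4, chosen
so that the bias sigma^2/n is easy to see; the function name simulate
is illustrative. It compares the uncorrected square, the square
corrected by subtracting s^2/n, and the fraction of uncorrected
squares that exceed D^2.

```python
import random
import statistics


def simulate(D, sigma, n, trials, seed=0):
    """Return (squares of the mean, bias-corrected squares) over repeated samples."""
    rng = random.Random(seed)
    plain, corrected = [], []
    for _ in range(trials):
        d = [rng.gauss(D, sigma) for _ in range(n)]
        dbar = sum(d) / n
        s2 = sum((x - dbar) ** 2 for x in d) / (n - 1)  # unbiased estimate of sigma^2
        plain.append(dbar**2)
        corrected.append(dbar**2 - s2 / n)
    return plain, corrected


D, sigma, n = 2.0, 1.0, 4  # hypothetical true diameter, standard deviation, sample size
plain, corrected = simulate(D, sigma, n, trials=50_000)

mean_plain = statistics.fmean(plain)          # expected near D^2 + sigma^2/n
mean_corrected = statistics.fmean(corrected)  # expected near D^2
var_plain = statistics.variance(plain)
var_corrected = statistics.variance(corrected)
frac_above = sum(1 for v in plain if v > D**2) / len(plain)

print(f"mean of dbar^2         = {mean_plain:.3f}  (D^2 + sigma^2/n = {D**2 + sigma**2 / n:.3f})")
print(f"mean of dbar^2 - s^2/n = {mean_corrected:.3f}  (D^2 = {D**2:.3f})")
print(f"variance, uncorrected  = {var_plain:.3f}")
print(f"variance, corrected    = {var_corrected:.3f}")
print(f"fraction of dbar^2 above D^2 = {frac_above:.3f}")
```

Multiplying each value by pi/4 gives the corresponding determinations
of area; the comparisons are unchanged by that constant factor.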

CAPTAIN READING: I wonder if you would clarify the distinction
between the types of errors that you mentioned. Presumably when you
had some means of measuring, of determining the bias, you have what
we call a systematic error, and you can eliminate it if you have
some means of measuring the factor that put that error in there.
Observations also contain what the Europeans call accidental errors,
which they try to distinguish and separate out. I think that there
is here some distortion of terms. I wonder if you'd care to clarify
the definition of them. It seems to me that you left out something
about accidental errors. I think that the terms random errors and
gross errors instead of accidental errors might clarify the
discussion a lot. Would you mind commenting on this?
DR. EISENHART: I used the term "random errors" to denote what are
often termed "accidental errors" because I feel that the term
"random" serves better than does "accidental" to signify the basic
concept involved. It is my understanding that "gross errors" is the
term used to signify deviations caused by actual blunders or
mistakes of the observer, or by departures from the ideal operation
of the process of measurement resulting from inattention of the
observer to explicit or implicit requirements of the instructions.
They are primarily "observer errors." "Gross errors" that are
discovered tend to be large in magnitude; hence the term.

"Systematic errors" differ in my understanding from "gross errors"
through their tendency to persist through several successive
measurements, or perhaps through an entire series of measurements,
whereas the more usual types of gross errors are sporadic in
occurrence and affect only a single observation or only a small
number of successive measurements. The term "systematic errors"
appears to be the general term for errors that manifest themselves
as shifts in the central position of the readings, as trends in the
readings, or as oscillatory variations in the position of the
readings over long or short periods. Systematic errors that persist
throughout an entire series of readings are more properly termed
"constant errors."
A "constant error" may affect just one particular series of
measurements by a given direct-measurement process, arising through
some fault in the execution of the instructions on that occasion.
If, on the other hand, the "constant error" is in fact "an error of
method," resulting in the displacement of the limiting mean μ for
that method, even when properly applied, from the true value V, then
it is in effect a source of bias. The existence of constant errors
can at least in principle be detected by taking several series of
measurements at quite different times on standard material; but bias
which is a characteristic of the method of measurement itself cannot
be detected except by comparison of results obtained by different
methods of measurement that are assumed (when appropriate
corrections have been applied) to measure the same quantity.

Finally, experience shows that in the absence of gross errors and
when all available sources of systematic error have been removed, or
their effects eliminated by appropriate corrections, a sequence of
measurements nevertheless exhibits fluctuations that may be
considered to be the manifestation of the inherent vicissitudes of
many minor uncontrolled factors; that the errors thus generated are
unpredictable; and that their causes defy diagnosis and eradication.
These "rock-bottom" variations behave like a series of random
drawings out of a formal mathematical "urn" such as are considered
in the theory of probability, and are in consequence termed "random
errors." It is only to these random errors that the notion of a
"law of error" strictly applies. Furthermore, precision has to do
with the effect of these random errors. The probable error or
standard error of an average of a number of readings serves only to
indicate the character of the uncertainty in this average resulting
solely from the effects of the random errors.
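
As a concrete illustration of the last point, here is a minimal
sketch, not from the original address, of how the standard error and
probable error of an average are commonly computed from a set of
readings. The readings are made-up numbers, the function name is
hypothetical, and the factor 0.6745 (the 0.75 quantile of the
standard normal distribution) assumes normally distributed random
errors. The example also shifts every reading by the same amount to
show that a constant error moves the average but leaves the computed
standard error unchanged.

```python
import math
import statistics

# Probable error = 0.6745 x standard error, assuming normal errors.
PROBABLE_ERROR_FACTOR = 0.6745


def precision_of_average(readings):
    """Return (mean, standard error, probable error) for the readings."""
    n = len(readings)
    mean = sum(readings) / n
    s = statistics.stdev(readings)       # sample SD with divisor n - 1
    standard_error = s / math.sqrt(n)    # reflects scatter only
    probable_error = PROBABLE_ERROR_FACTOR * standard_error
    return mean, standard_error, probable_error


readings = [10.02, 9.98, 10.01, 9.99, 10.00]   # made-up readings
shifted = [r + 0.05 for r in readings]         # same readings plus a constant error

for label, data in (("original", readings), ("shifted", shifted)):
    mean, se, pe = precision_of_average(data)
    print(f"{label}: mean={mean:.4f}  s.e.={se:.5f}  p.e.={pe:.5f}")
```

The two printed lines differ in the mean but not in the standard or
probable error, which is why these quantities say nothing about
constant errors or bias.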

It is my opinion that in some quarters there is a tendency to
incorporate in the computation of probable errors or standard
errors components of error that are estimated magnitudes of the
constant error or bias present; and that these should be reported
separately and not built into the probable error.
Consider the following example: In a particular direct-measurement
process the limiting mean μ is sensitive to the actual diameter of a
certain wire where it passes through a hole in the apparatus. The
experimenter, realizing that the limiting mean depended upon the
diameter of the wire, but not realizing how sensitive it was to this
quantity, applied the necessary correction to the average of his
measurements, using the nominal diameter of the wire as given on the
spool. Some time later he discovered that variations in diameter of
the size that occur along the wire from a single spool are
sufficient in magnitude to have an important effect on the result.
Unfortunately, he no longer had at hand the piece of wire used in
the earlier experiment. He therefore measured the diameter of the
wire from his spool at a large number of points along its length,
and determined upper and lower limits which he believed would
bracket the true diameter of the wire that he actually used. Now it
seems to me that
in such a case he should simply compute and report bounds to the
possible bias of the end results of the first experiment that may
have resulted from his failure to make this more refined diameter
correction. The actual bias might have been zero, since he may have
been lucky and have used a piece of wire that actually had the
nominal diameter. He will never know what his actual bias was, but
he can set limits on it. Now I feel that these limits on the
possible bias should be recorded as such, and no attempt made to
build the added uncertainty into the probable error of the mean that
he reported earlier.
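
The kind of report recommended here might be sketched as follows.
This is an illustration only and not the experimenter's actual
computation: it assumes that the correction depends linearly on the
wire diameter through a known sensitivity coefficient, and every
name and number in it (sensitivity, nominal, d_low, d_high, the
reported value and its probable error) is hypothetical.

```python
def bias_bounds(sensitivity, nominal, d_low, d_high):
    """Bounds on the bias from having used the nominal wire diameter.

    Assumes (hypothetically) a correction linear in the diameter, so
    that the error in the applied correction is
    sensitivity * (nominal - true diameter), with the true diameter
    known only to lie between d_low and d_high.
    """
    ends = (sensitivity * (nominal - d_low),
            sensitivity * (nominal - d_high))
    return min(ends), max(ends)


def format_report(value, probable_error, bounds):
    """Keep the probable error and the bias limits as separate items."""
    low, high = bounds
    return (f"result = {value:.4f}, probable error = {probable_error:.4f} "
            f"(random errors only); possible bias between "
            f"{low:+.4f} and {high:+.4f}")


# Illustrative numbers only.
bounds = bias_bounds(sensitivity=2.0, nominal=0.500, d_low=0.498, d_high=0.503)
print(format_report(value=12.3456, probable_error=0.0030, bounds=bounds))
```

Because the nominal diameter lies between the two limits in this
example, the interval of possible bias contains zero, in keeping
with the remark that the actual bias might have been zero.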

If, on the other hand, determination of the actual diameter of the
wire is a difficult thing to do, and is so difficult that he would
not plan to do it habitually in the applications of this method, but
instead would simply employ the nominal diameter as a basis for his
corrections, then, in evaluating the precision of this method, it
would, I feel, be appropriate to compute a component of random error
arising from deviations of actual diameters from the nominal in
randomly chosen pieces of wire, and to incorporate this in his
computed or estimated probable error of the procedure. This is a
ticklish matter, and I do not want to go into it any further here,
but I feel that it does deserve careful attention. No doubt each of
you can think of instances of this sort in your own work.
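
One common way of incorporating such a component, offered here only
as a sketch under stated assumptions and not as the speaker's
prescription, is to combine the two probable errors in quadrature.
This assumes that the diameter deviations are independent of the
other random errors and that the result depends linearly on the
diameter through a sensitivity coefficient. The function name and
all numbers are hypothetical.

```python
import math


def procedure_probable_error(pe_measurement, sensitivity, pe_diameter):
    """Probable error of the procedure when nominal diameters are used.

    pe_measurement: probable error from the ordinary random errors.
    sensitivity:    assumed linear effect of diameter on the result.
    pe_diameter:    probable error of actual diameter about the nominal
                    for randomly chosen pieces of wire.
    Independent components are combined as a root sum of squares.
    """
    return math.hypot(pe_measurement, sensitivity * pe_diameter)


# Illustrative numbers only: components 0.003 and 2.0 * 0.002 = 0.004.
print(procedure_probable_error(0.003, 2.0, 0.002))
```

The root-sum-of-squares rule applies equally to standard errors,
since a probable error is a fixed multiple of the standard error
when the errors are normal.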